k-means seeding


Reviews: K-Medoids For K-Means Seeding

Neural Information Processing Systems

The authors propose using a particular K-medoids algorithm, CLARANS, which identifies the medoids through iterative swaps, to initialize k-means, and claim that this improves the final clustering quality. They test this claim on multiple datasets and demonstrate performance improvements. They have also provided code that will be made open source after the review process. The paper is easy to read and follow, and the authors have done a good job of placing their work in context. I appreciate that the optimizations are presented in a very accessible manner in Section 4. As the authors claim, open source code is an important contribution.
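To make the swap-based idea concrete, here is a minimal sketch of a CLARANS-style search in plain Python. It is not the authors' implementation and leaves out the optimizations from Section 4. The function names, the `max_failed_swaps` stopping rule, and the choice of squared Euclidean distance as the dissimilarity (picked so the cost matches the k-means objective) are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def medoid_cost(points, medoid_idx):
    """Sum over all points of the squared distance to the nearest medoid."""
    return sum(min(sq_dist(p, points[m]) for m in medoid_idx) for p in points)


def clarans_seed(points, k, max_failed_swaps=200, seed=0):
    """Simplified CLARANS-style search: propose random swaps, keep improvements.

    Returns the indices of the chosen medoids. `max_failed_swaps` is a
    hypothetical stopping rule: stop after that many consecutive rejections.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    cost = medoid_cost(points, medoids)
    failed = 0
    while failed < max_failed_swaps:
        slot = rng.randrange(k)                 # which medoid to replace
        candidate = rng.randrange(len(points))  # proposed replacement point
        if candidate in medoids:
            failed += 1
            continue
        trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
        trial_cost = medoid_cost(points, trial)  # naive full recomputation
        if trial_cost < cost:                    # accept only strict improvements
            medoids, cost = trial, trial_cost
            failed = 0
        else:
            failed += 1
    return medoids
```

The points at the returned indices can then be passed to k-means as its initial centers. This sketch recomputes the full cost for every proposed swap, which is exactly the kind of expense a practical implementation would need to avoid on large datasets.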


K-Medoids For K-Means Seeding

Newling, James, Fleuret, François

Neural Information Processing Systems

We show experimentally that the algorithm CLARANS of Ng and Han (1994) finds better K-medoids solutions than the Voronoi iteration algorithm of Hastie et al. (2001). This finding, along with the similarity between the Voronoi iteration algorithm and Lloyd's K-means algorithm, motivates us to use CLARANS as a K-means initializer. We show that CLARANS outperforms other algorithms on 23/23 datasets, with a mean decrease over k-means++ of 30% for initialization mean squared error (MSE) and 3% for final MSE. We introduce algorithmic improvements to CLARANS which improve its complexity and runtime, making it a viable initialization scheme for large datasets. Papers published at the Neural Information Processing Systems Conference.
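The similarity between the Voronoi iteration algorithm and Lloyd's algorithm can be seen in a small sketch: both alternate the same assignment step with an update step, and differ only in whether the new center is the cluster mean or a cluster member. The code below is an illustrative assumption, not code from the paper. All function names are hypothetical, and squared Euclidean distance is assumed as the dissimilarity for both variants.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for each point (shared by both algorithms)."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def lloyd_update(cluster):
    """K-means update: the new center is the coordinate-wise mean."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def voronoi_update(cluster):
    """K-medoids update: the member with the smallest total dissimilarity."""
    return min(cluster, key=lambda c: sum(sq_dist(c, p) for p in cluster))


def alternate(points, centers, update, max_iters=100):
    """Alternate assignment and update steps until the centers stop changing."""
    centers = list(centers)
    for _ in range(max_iters):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            cluster = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if its cluster is empty.
            new_centers.append(update(cluster) if cluster else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def mse(points, centers):
    """Mean squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)
```

Calling `alternate(points, seeds, lloyd_update)` runs Lloyd's algorithm from a given seeding, while `alternate(points, seeds, voronoi_update)` runs the medoid variant. Evaluating `mse` on the seeds and on the returned centers gives the initialization and final MSE, respectively.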